NGram Approach for Semantic Similarity on Arabic Short Text

Abstract

Measuring the semantic similarity between words requires a method that can simulate human thought. The use of computers to quantify and compare semantic similarities has become an important research area in various fields, including artificial intelligence, knowledge management, information retrieval, and natural language processing. Computational semantics require efficient measures for computing concept similarity, which still need to be developed. Several computational measures are based on resources such as the WordNet taxonomy, and taxonomical parameters have been applied to optimize the expression of content semantics. This paper presents a new measure for quantifying the semantic similarity between concepts, words, sentences, short text, and long text, based on NGram features and synonyms related to the same domain. The proposed algorithm was tested on 700 tweets, and its similarity values were compared with cosine similarity on the same dataset. The results were analyzed manually by a domain expert, who concluded that the proposed algorithm provided better values than cosine similarity within the selected domain regarding the semantic similarity between the datasets' texts.
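The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea it names: comparing two short texts by their shared word n-grams after mapping synonyms to a common form, alongside a plain bag-of-words cosine similarity as the baseline. The synonym table, the function names, the choice of Jaccard overlap, and the English example words are all assumptions made for illustration; none of them is taken from the paper.

```python
import math
from collections import Counter

# Hypothetical, hand-made synonym table (an assumption for illustration only).
# A real system would need a domain-specific Arabic synonym resource.
SYNONYMS = {"automobile": "car", "vehicle": "car"}


def normalize(text, synonyms=SYNONYMS):
    """Lowercase, split on whitespace, and map each token to its canonical synonym."""
    return [synonyms.get(token, token) for token in text.lower().split()]


def word_ngrams(tokens, n_max=2):
    """Return the set of all word n-grams of length 1..n_max."""
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams


def ngram_similarity(text_a, text_b, n_max=2):
    """Jaccard overlap of synonym-normalized word n-grams (illustrative measure)."""
    grams_a = word_ngrams(normalize(text_a), n_max)
    grams_b = word_ngrams(normalize(text_b), n_max)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw term-frequency vectors, with no synonym handling."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    a = "the car is fast"
    b = "the automobile is fast"
    print(ngram_similarity(a, b))
    print(cosine_similarity(a, b))
```

In this toy pair the two sentences differ only by a synonym, so the synonym-aware n-gram score treats them as identical while the plain cosine score does not. This illustrates why synonym handling can matter for short texts; it says nothing about the results reported in the paper.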

Related resources

Benchmarking short text semantic similarity

Short Text Semantic Similarity measurement is a new and rapidly growing field of research. “Short texts” are typically sentence length but are not required to be grammatically correct. There is great potential for applying these measures in fields such as Information Retrieval, Dialogue Management and Question Answering. A dataset of 65 sentence pairs, with similarity ratings, produced in 2006 ...

Text-to-Text Semantic Similarity for Automatic Short Answer Grading

In this paper, we explore unsupervised techniques for the task of automatic short answer grading. We compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. Overall...

A Comparative Study of Two Short Text Semantic Similarity Measures

This paper describes a comparative study of STASIS and LSA. These measures of semantic similarity can be applied to short texts for use in Conversational Agents (CAs). CAs are computer programs that interact with humans through natural language dialogue. Business organizations have spent large sums of money in recent years developing them for online customer self-service, but achievements have b...

ECNUCS: Measuring Short Text Semantic Equivalence Using Multiple Similarity Measurements

This paper reports our submissions to the Semantic Textual Similarity (STS) task in the *SEM Shared Task 2013. We submitted three Support Vector Regression (SVR) systems in the core task, using 6 types of similarity measures, i.e., string similarity, number similarity, knowledge-based similarity, corpus-based similarity, syntactic dependency similarity and machine translation similarity. Our third syst...

A distributional similarity approach to the detection of semantic change in the Google Books Ngram corpus.

This paper presents a novel approach for automatic detection of semantic change of words based on distributional similarity models. We show that the method obtains good results with respect to a reference ranking produced by human raters. The evaluation also analyzes the performance of frequency-based methods, comparing them to the similarity method proposed.

Journal

Journal title: International Journal of Advanced Computer Science and Applications

Year: 2022

ISSN: 2158-107X, 2156-5570

DOI: https://doi.org/10.14569/ijacsa.2022.0131199